Overview

Dataset statistics

Number of variables: 15
Number of observations: 5336
Missing cells: 5140
Missing cells (%): 6.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 625.4 KiB
Average record size in memory: 120.0 B
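
The missing-cell percentage can be checked by hand from figures in this report: four variables are flagged as missing 1285 values each, and the table holds 15 × 5336 cells. A minimal sketch of that arithmetic follows; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics and the "Missing" warnings in this report.
n_variables = 15
n_observations = 5336
missing_per_variable = 1285   # each variable flagged "Missing" lacks this many values
n_missing_variables = 4       # NUMBER_OF_FAMILY_MEMBERS, FIXED_MONTHLY_EXPENSES, INCOME_houshold, dpd

total_cells = n_variables * n_observations
missing_cells = missing_per_variable * n_missing_variables
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)           # 5140
print(f"{missing_pct:.1f}%")   # 6.4%
```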

Variable types

NUM: 15

Reproduction

Analysis started: 2021-05-20 19:43:57.163726
Analysis finished: 2021-05-20 19:44:41.739270
Duration: 44.58 seconds
Version: pandas-profiling v2.7.1
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

original_matur_years is highly correlated with years_to_matur (High correlation)
years_to_matur is highly correlated with original_matur_years (High correlation)
age_loan_years is highly correlated with id (High correlation)
id is highly correlated with age_loan_years (High correlation)
NUMBER_OF_FAMILY_MEMBERS has 1285 (24.1%) missing values (Missing)
FIXED_MONTHLY_EXPENSES has 1285 (24.1%) missing values (Missing)
INCOME_houshold has 1285 (24.1%) missing values (Missing)
dpd has 1285 (24.1%) missing values (Missing)
dpd is highly skewed (γ1 = 23.46348154) (Skewed)
id is uniformly distributed (Uniform)
id has unique values (Unique)
outstanding_volume has unique values (Unique)
planned_installments has 87 (1.6%) zeros (Zeros)
prepaid_amount has 371 (7.0%) zeros (Zeros)
NUMBER_OF_FAMILY_MEMBERS has 2064 (38.7%) zeros (Zeros)
FIXED_MONTHLY_EXPENSES has 2807 (52.6%) zeros (Zeros)
INCOME_houshold has 2701 (50.6%) zeros (Zeros)
dpd has 1258 (23.6%) zeros (Zeros)
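
The zero, missing, and skewness warnings above rest on simple per-column quantities. The sketch below shows one way to compute them in plain Python. It assumes missing values are represented as `None` and uses the adjusted Fisher-Pearson estimator for γ1; whether the report uses exactly this estimator is an assumption, and the function names are hypothetical.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness of the non-missing values (assumed estimator)."""
    xs = [x for x in values if x is not None]
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three non-missing values")
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n   # third central moment
    if m2 == 0:
        return 0.0                              # constant column: treat as unskewed
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

def zero_and_missing_share(values):
    """Return (zeros %, missing %), both relative to all observations."""
    n = len(values)
    missing = sum(1 for x in values if x is None)
    zeros = sum(1 for x in values if x is not None and x == 0)
    return 100 * zeros / n, 100 * missing / n

# Toy column with a long right tail, loosely shaped like dpd (made-up values).
toy = [0, 0, 0.02, 0.1, None, 3.4, 295.6]
print(zero_and_missing_share(toy))
print(sample_skewness(toy) > 0)   # True: the right tail gives positive skewness
```

Note that the zero share is taken over all observations, missing ones included, which matches the report: 1258 zeros out of 5336 rows is 23.6% for dpd.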

Variables

id
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE
Distinct count: 5336
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2675.9913793103447
Minimum: 1
Maximum: 5355
Zeros: 0
Zeros (%): 0.0%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 267.75
Q1: 1336.75
Median: 2672.5
Q3: 4015.25
95-th percentile: 5088.25
Maximum: 5355
Range: 5354
Interquartile range (IQR): 2678.5

Descriptive statistics

Standard deviation: 1547.065009
Coefficient of variation (CV): 0.5781277999
Kurtosis: -1.200888697
Mean: 2675.991379
Median Absolute Deviation (MAD): 1339.5
Skewness: 0.003123542139
Sum: 14279090
Variance: 2393410.141
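
Several of the statistics listed for each variable are derived from the others. As a quick consistency check, the sketch below recomputes the range, IQR, coefficient of variation, and variance for `id` from the reported values; the variable names are illustrative.

```python
# Values copied from the quantile and descriptive statistics reported for `id`.
minimum, maximum = 1, 5355
q1, q3 = 1336.75, 4015.25
mean, std = 2675.991379, 1547.065009

value_range = maximum - minimum   # Range: 5354
iqr = q3 - q1                     # Interquartile range: 2678.5
cv = std / mean                   # Coefficient of variation: about 0.578
variance = std ** 2               # Variance: about 2393410.14

print(value_range, iqr, round(cv, 4), round(variance, 2))
```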
Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
2047 | 1 | < 0.1%
585 | 1 | < 0.1%
2636 | 1 | < 0.1%
589 | 1 | < 0.1%
4687 | 1 | < 0.1%
2640 | 1 | < 0.1%
593 | 1 | < 0.1%
4691 | 1 | < 0.1%
2644 | 1 | < 0.1%
597 | 1 | < 0.1%
Other values (5326) | 5326 | 99.8%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%

Value | Count | Frequency (%)
5355 | 1 | < 0.1%
5354 | 1 | < 0.1%
5353 | 1 | < 0.1%
5352 | 1 | < 0.1%
5351 | 1 | < 0.1%

date_str
Real number (ℝ≥0)

Distinct count: 287
Unique (%): 5.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20173700.114055436
Minimum: 20160131.0
Maximum: 20192740.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 20160131
5-th percentile: 20162037.5
Q1: 20167766.11
Median: 20172184.26
Q3: 20176660.42
95-th percentile: 20192184.15
Maximum: 20192740.5
Range: 32609.5
Interquartile range (IQR): 8894.312857

Descriptive statistics

Standard deviation: 8580.898157
Coefficient of variation (CV): 0.0004253507343
Kurtosis: -0.07460327278
Mean: 20173700.11
Median Absolute Deviation (MAD): 4476.16359
Skewness: 0.8084568429
Sum: 1.076468638e+11
Variance: 73631813.18

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
20176660.42 | 471 | 8.8%
20172184.26 | 143 | 2.7%
20192347.17 | 140 | 2.6%
20171706.71 | 134 | 2.5%
20169011.06 | 125 | 2.3%
20170062.79 | 123 | 2.3%
20168620.4 | 123 | 2.3%
20167297.07 | 118 | 2.2%
20167766.11 | 117 | 2.2%
20170379 | 116 | 2.2%
Other values (277) | 3726 | 69.8%

Value | Count | Frequency (%)
20160131 | 6 | 0.1%
20160180 | 13 | 0.2%
20160230.33 | 7 | 0.1%
20160280.25 | 11 | 0.2%
20160330.4 | 18 | 0.3%

Value | Count | Frequency (%)
20192740.5 | 14 | 0.3%
20192530.45 | 59 | 1.1%
20192347.17 | 140 | 2.6%
20192184.15 | 94 | 1.8%
20192037.5 | 49 | 0.9%

years_to_matur
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 5174
Unique (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.49482295484015
Minimum: 0.2225
Maximum: 34.92460000000001
Zeros: 0
Zeros (%): 0.0%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 0.2225
5-th percentile: 1.366416667
Q1: 2.725819444
Median: 6.628365385
Q3: 14.78214489
95-th percentile: 25.2825931
Maximum: 34.9246
Range: 34.7021
Interquartile range (IQR): 12.05632544

Descriptive statistics

Standard deviation: 8.189771221
Coefficient of variation (CV): 0.8625512303
Kurtosis: -0.300599814
Mean: 9.494822955
Median Absolute Deviation (MAD): 4.460735755
Skewness: 0.9401670615
Sum: 50664.37529
Variance: 67.07235265

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
19.51 | 6 | 0.1%
19.26444444 | 5 | 0.1%
19.346875 | 5 | 0.1%
9.298235294 | 4 | 0.1%
29.50583333 | 4 | 0.1%
9.47 | 4 | 0.1%
9.258888889 | 3 | 0.1%
23.14288889 | 3 | 0.1%
1.723548387 | 3 | 0.1%
9.505833333 | 3 | 0.1%
Other values (5164) | 5296 | 99.3%

Value | Count | Frequency (%)
0.2225 | 1 | < 0.1%
0.252 | 1 | < 0.1%
0.2766666667 | 1 | < 0.1%
0.335 | 1 | < 0.1%
0.35 | 1 | < 0.1%

Value | Count | Frequency (%)
34.9246 | 1 | < 0.1%
34.91166667 | 1 | < 0.1%
34.53583333 | 1 | < 0.1%
34.21789474 | 1 | < 0.1%
34.1508 | 1 | < 0.1%

age_owner_years
Real number (ℝ≥0)

Distinct count: 3267
Unique (%): 61.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.766936024571656
Minimum: 22.270000000000003
Maximum: 74.60235294117648
Zeros: 0
Zeros (%): 0.0%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 22.27
5-th percentile: 32.40192513
Q1: 41.72976364
Median: 49.71574176
Q3: 50.3825
95-th percentile: 59.11730691
Maximum: 74.60235294
Range: 52.33235294
Interquartile range (IQR): 8.652736364

Descriptive statistics

Standard deviation: 7.690237373
Coefficient of variation (CV): 0.1644374857
Kurtosis: 0.3995049449
Mean: 46.76693602
Median Absolute Deviation (MAD): 1.334883242
Skewness: -0.3508125672
Sum: 249548.3706
Variance: 59.13975086

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
49.8825 | 33 | 0.6%
50.80026316 | 25 | 0.5%
49.549 | 25 | 0.5%
49.50741935 | 23 | 0.4%
50.59186047 | 21 | 0.4%
49.67407407 | 21 | 0.4%
50.46673913 | 21 | 0.4%
50.75871795 | 20 | 0.4%
49.79916667 | 20 | 0.4%
50.67536585 | 18 | 0.3%
Other values (3257) | 5109 | 95.7%

Value | Count | Frequency (%)
22.27 | 1 | < 0.1%
23.76461538 | 1 | < 0.1%
23.94785714 | 1 | < 0.1%
25.08571429 | 1 | < 0.1%
25.19363636 | 1 | < 0.1%

Value | Count | Frequency (%)
74.60235294 | 1 | < 0.1%
73.78386364 | 1 | < 0.1%
73.06166667 | 1 | < 0.1%
71.80268293 | 1 | < 0.1%
71.47756098 | 1 | < 0.1%

original_matur_years
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 2340
Unique (%): 43.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.71496064467766
Minimum: 2.4600000000000004
Maximum: 41.99
Zeros: 0
Zeros (%): 0.0%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 2.46
5-th percentile: 3.03
Q1: 5.02
Median: 10.05
Q3: 20.11
95-th percentile: 30.5625
Maximum: 41.99
Range: 39.53
Interquartile range (IQR): 15.09

Descriptive statistics

Standard deviation: 9.236201394
Coefficient of variation (CV): 0.6734398759
Kurtosis: -0.6812482264
Mean: 13.71496064
Median Absolute Deviation (MAD): 5.07
Skewness: 0.7185967562
Sum: 73183.03
Variance: 85.30741619

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
5 | 128 | 2.4%
5.03 | 98 | 1.8%
4.97 | 80 | 1.5%
3 | 79 | 1.5%
10 | 61 | 1.1%
10.01 | 55 | 1.0%
4 | 38 | 0.7%
5.01 | 38 | 0.7%
5.01 | 37 | 0.7%
10.01 | 35 | 0.7%
Other values (2330) | 4687 | 87.8%

Value | Count | Frequency (%)
2.46 | 1 | < 0.1%
2.72 | 1 | < 0.1%
2.9 | 1 | < 0.1%
2.96 | 1 | < 0.1%
2.96 | 5 | 0.1%

Value | Count | Frequency (%)
41.99 | 1 | < 0.1%
41.96 | 1 | < 0.1%
41.41 | 1 | < 0.1%
40.61 | 1 | < 0.1%
40.5 | 1 | < 0.1%

client_rate
Real number (ℝ≥0)

Distinct count: 1312
Unique (%): 24.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.048790161169415286
Minimum: 0.02059999999999998
Maximum: 0.09800000000000003
Zeros: 0
Zeros (%): 0.0%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 0.0206
5-th percentile: 0.0269
Q1: 0.0357
Median: 0.0442
Q3: 0.0579
95-th percentile: 0.0879
Maximum: 0.098
Range: 0.0774
Interquartile range (IQR): 0.0222

Descriptive statistics

Standard deviation: 0.01775843674
Coefficient of variation (CV): 0.3639757754
Kurtosis: 0.5122280985
Mean: 0.04879016117
Median Absolute Deviation (MAD): 0.0116
Skewness: 0.9443781656
Sum: 260.3443
Variance: 0.0003153620756

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0.0579 | 154 | 2.9%
0.068 | 145 | 2.7%
0.073 | 94 | 1.8%
0.0579 | 59 | 1.1%
0.0579 | 53 | 1.0%
0.0529 | 51 | 1.0%
0.0344 | 48 | 0.9%
0.068 | 47 | 0.9%
0.0579 | 45 | 0.8%
0.0579 | 43 | 0.8%
Other values (1302) | 4597 | 86.2%

Value | Count | Frequency (%)
0.0206 | 2 | < 0.1%
0.0207 | 1 | < 0.1%
0.0207 | 4 | 0.1%
0.0208 | 5 | 0.1%
0.021 | 1 | < 0.1%

Value | Count | Frequency (%)
0.098 | 17 | 0.3%
0.098 | 14 | 0.3%
0.098 | 12 | 0.2%
0.098 | 34 | 0.6%
0.098 | 25 | 0.5%

original_volume
Real number (ℝ≥0)

Distinct count: 2100
Unique (%): 39.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 165086.50153121963
Minimum: 1037.0
Maximum: 3060751.590000002
Zeros: 0
Zeros (%): 0.0%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 1037
5-th percentile: 13481
Q1: 51850
Median: 114070
Q3: 213259.05
95-th percentile: 497760
Maximum: 3060751.59
Range: 3059714.59
Interquartile range (IQR): 161409.05

Descriptive statistics

Standard deviation: 172615.0968
Coefficient of variation (CV): 1.045603942
Kurtosis: 25.96738691
Mean: 165086.5015
Median Absolute Deviation (MAD): 72590
Skewness: 3.344493297
Sum: 880901572.2
Variance: 2.979597166e+10

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
103700 | 258 | 4.8%
155550 | 134 | 2.5%
51850 | 130 | 2.4%
207400 | 128 | 2.4%
72590 | 103 | 1.9%
82960 | 99 | 1.9%
124440 | 93 | 1.7%
62220 | 85 | 1.6%
41480 | 75 | 1.4%
20740 | 75 | 1.4%
Other values (2090) | 4156 | 77.9%

Value | Count | Frequency (%)
1037 | 1 | < 0.1%
2696.2 | 1 | < 0.1%
3836.9 | 1 | < 0.1%
4148 | 3 | 0.1%
4251.7 | 1 | < 0.1%

Value | Count | Frequency (%)
3060751.59 | 1 | < 0.1%
2074000 | 1 | < 0.1%
1742160 | 1 | < 0.1%
1659200 | 1 | < 0.1%
1555500 | 2 | < 0.1%

age_loan_years
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 4887
Unique (%): 91.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.220157159194571
Minimum: 0.01
Maximum: 16.12
Zeros: 0
Zeros (%): 0.0%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.5623076923
Q1: 1.825432331
Median: 3.406742424
Q3: 6.462611434
95-th percentile: 9.479347826
Maximum: 16.12
Range: 16.11
Interquartile range (IQR): 4.637179103

Descriptive statistics

Standard deviation: 2.916417291
Coefficient of variation (CV): 0.6910684084
Kurtosis: -0.3364997565
Mean: 4.220157159
Median Absolute Deviation (MAD): 2.040274621
Skewness: 0.6713532068
Sum: 22518.7586
Variance: 8.505489816

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0.5276923077 | 8 | 0.1%
0.5576923077 | 8 | 0.1%
0.7252941176 | 8 | 0.1%
0.5492307692 | 8 | 0.1%
0.5669230769 | 8 | 0.1%
0.5066666667 | 7 | 0.1%
0.4758333333 | 7 | 0.1%
0.5776923077 | 7 | 0.1%
0.49 | 7 | 0.1%
0.4941666667 | 7 | 0.1%
Other values (4877) | 5261 | 98.6%

Value | Count | Frequency (%)
0.01 | 1 | < 0.1%
0.09 | 1 | < 0.1%
0.1 | 2 | < 0.1%
0.1 | 1 | < 0.1%
0.1266666667 | 1 | < 0.1%

Value | Count | Frequency (%)
16.12 | 1 | < 0.1%
15.76727273 | 1 | < 0.1%
15.19170732 | 1 | < 0.1%
14.49342857 | 1 | < 0.1%
14.43 | 1 | < 0.1%

outstanding_volume
Real number (ℝ≥0)

UNIQUE
Distinct count: 5336
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 101980.55702776244
Minimum: 391.9171428571429
Maximum: 1666664.0092000002
Zeros: 0
Zeros (%): 0.0%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 391.9171429
5-th percentile: 5775.311763
Q1: 24140.1645
Median: 63205.05775
Q3: 137106.7146
95-th percentile: 325535.9448
Maximum: 1666664.009
Range: 1666272.092
Interquartile range (IQR): 112966.5501

Descriptive statistics

Standard deviation: 117990.9137
Coefficient of variation (CV): 1.156994207
Kurtosis: 17.01850308
Mean: 101980.557
Median Absolute Deviation (MAD): 47050.52624
Skewness: 3.003482294
Sum: 544168252.3
Variance: 1.392185573e+10

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
155331.6133 | 1 | < 0.1%
91924.08429 | 1 | < 0.1%
35419.98368 | 1 | < 0.1%
383211.6678 | 1 | < 0.1%
227670.3665 | 1 | < 0.1%
38210.8505 | 1 | < 0.1%
223965.9823 | 1 | < 0.1%
391.9171429 | 1 | < 0.1%
289171.1658 | 1 | < 0.1%
45975.8552 | 1 | < 0.1%
Other values (5326) | 5326 | 99.8%

Value | Count | Frequency (%)
391.9171429 | 1 | < 0.1%
974.498 | 1 | < 0.1%
983.2952941 | 1 | < 0.1%
1033.65 | 1 | < 0.1%
1037 | 1 | < 0.1%

Value | Count | Frequency (%)
1666664.009 | 1 | < 0.1%
1280634.131 | 1 | < 0.1%
1168332.368 | 1 | < 0.1%
1155583.71 | 1 | < 0.1%
1083240.647 | 1 | < 0.1%

planned_installments
Real number (ℝ≥0)

ZEROS
Distinct count: 5242
Unique (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 826.6141932608041
Minimum: 0.0
Maximum: 14381.5572972973
Zeros: 87
Zeros (%): 1.6%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 119.8299653
Q1: 318.4323333
Median: 568.4001935
Q3: 992.8168482
95-th percentile: 2339.158936
Maximum: 14381.5573
Range: 14381.5573
Interquartile range (IQR): 674.3845149

Descriptive statistics

Standard deviation: 937.8308729
Coefficient of variation (CV): 1.134544846
Kurtosis: 39.36086509
Mean: 826.6141933
Median Absolute Deviation (MAD): 301.5206234
Skewness: 4.792225208
Sum: 4410813.335
Variance: 879526.7462

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 87 | 1.6%
864.16 | 5 | 0.1%
1728.34 | 3 | 0.1%
432.09 | 2 | < 0.1%
1296.25 | 2 | < 0.1%
134.5344 | 1 | < 0.1%
107.2725 | 1 | < 0.1%
1578.916538 | 1 | < 0.1%
625.6085 | 1 | < 0.1%
424.1090476 | 1 | < 0.1%
Other values (5232) | 5232 | 98.1%

Value | Count | Frequency (%)
0 | 87 | 1.6%
12.9044 | 1 | < 0.1%
14.85857143 | 1 | < 0.1%
15.63 | 1 | < 0.1%
26.57666667 | 1 | < 0.1%

Value | Count | Frequency (%)
14381.5573 | 1 | < 0.1%
12836.15529 | 1 | < 0.1%
11693.30222 | 1 | < 0.1%
11422.21727 | 1 | < 0.1%
10909.01757 | 1 | < 0.1%

prepaid_amount
Real number (ℝ≥0)

ZEROS
Distinct count: 4605
Unique (%): 86.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2305.9657315674654
Minimum: 0.0
Maximum: 204997.79
Zeros: 371
Zeros (%): 7.0%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 230.4444444
Median: 821.7166327
Q3: 2357.514952
95-th percentile: 7692.518359
Maximum: 204997.79
Range: 204997.79
Interquartile range (IQR): 2127.070508

Descriptive statistics

Standard deviation: 7089.313809
Coefficient of variation (CV): 3.07433615
Kurtosis: 366.9873698
Mean: 2305.965732
Median Absolute Deviation (MAD): 729.1215486
Skewness: 16.3062481
Sum: 12304633.14
Variance: 50258370.29

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 371 | 7.0%
414.8 | 28 | 0.5%
1037 | 14 | 0.3%
622.2 | 12 | 0.2%
1244.4 | 12 | 0.2%
518.5 | 11 | 0.2%
207.4 | 10 | 0.2%
829.6 | 10 | 0.2%
311.1 | 10 | 0.2%
2074 | 10 | 0.2%
Other values (4595) | 4848 | 90.9%

Value | Count | Frequency (%)
0 | 371 | 7.0%
2.728947368 | 1 | < 0.1%
3.013846154 | 1 | < 0.1%
3.988461538 | 2 | < 0.1%
4.320833333 | 1 | < 0.1%

Value | Count | Frequency (%)
204997.79 | 1 | < 0.1%
188878.56 | 1 | < 0.1%
172237.7067 | 1 | < 0.1%
162809 | 1 | < 0.1%
114417.502 | 1 | < 0.1%

NUMBER_OF_FAMILY_MEMBERS
Real number (ℝ≥0)

MISSING
ZEROS
Distinct count: 9
Unique (%): 0.2%
Missing: 1285
Missing (%): 24.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2001974821031844
Minimum: 0.0
Maximum: 50.0
Zeros: 2064
Zeros (%): 38.7%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 2
95-th percentile: 4
Maximum: 50
Range: 50
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.812892092
Coefficient of variation (CV): 1.510494831
Kurtosis: 257.8625276
Mean: 1.200197482
Median Absolute Deviation (MAD): 0
Skewness: 10.11923558
Sum: 4862
Variance: 3.286577739

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 2064 | 38.7%
2 | 675 | 12.6%
1 | 509 | 9.5%
3 | 390 | 7.3%
4 | 338 | 6.3%
5 | 63 | 1.2%
6 | 7 | 0.1%
8 | 3 | 0.1%
50 | 2 | < 0.1%
(Missing) | 1285 | 24.1%

Value | Count | Frequency (%)
0 | 2064 | 38.7%
1 | 509 | 9.5%
2 | 675 | 12.6%
3 | 390 | 7.3%
4 | 338 | 6.3%

Value | Count | Frequency (%)
50 | 2 | < 0.1%
8 | 3 | 0.1%
6 | 7 | 0.1%
5 | 63 | 1.2%
4 | 338 | 6.3%

FIXED_MONTHLY_EXPENSES
Real number (ℝ≥0)

MISSING
ZEROS
Distinct count: 150
Unique (%): 3.7%
Missing: 1285
Missing (%): 24.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 365.76509898790425
Minimum: 0.0
Maximum: 11700.0
Zeros: 2807
Zeros (%): 52.6%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 390
95-th percentile: 1790.3795
Maximum: 11700
Range: 11700
Interquartile range (IQR): 390

Descriptive statistics

Standard deviation: 775.3740087
Coefficient of variation (CV): 2.119868765
Kurtosis: 26.03086803
Mean: 365.765099
Median Absolute Deviation (MAD): 0
Skewness: 3.820677667
Sum: 1481714.416
Variance: 601204.8534

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 2807 | 52.6%
650 | 172 | 3.2%
1300 | 143 | 2.7%
390 | 89 | 1.7%
780 | 84 | 1.6%
1040 | 66 | 1.2%
1950 | 58 | 1.1%
1560 | 57 | 1.1%
130 | 55 | 1.0%
2600 | 51 | 1.0%
Other values (140) | 469 | 8.8%
(Missing) | 1285 | 24.1%

Value | Count | Frequency (%)
0 | 2807 | 52.6%
13 | 1 | < 0.1%
26 | 1 | < 0.1%
65 | 21 | 0.4%
78 | 2 | < 0.1%

Value | Count | Frequency (%)
11700 | 1 | < 0.1%
7800 | 2 | < 0.1%
6760 | 1 | < 0.1%
6500 | 6 | 0.1%
5443.88 | 1 | < 0.1%

INCOME_houshold
Real number (ℝ)

MISSING
ZEROS
Distinct count: 1264
Unique (%): 31.2%
Missing: 1285
Missing (%): 24.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 4423.720887682054
Minimum: -28130.388000000014
Maximum: 335624.6520000002
Zeros: 2701
Zeros (%): 50.6%
Memory size: 41.8 KiB

Quantile statistics

Minimum: -28130.388
5-th percentile: 0
Q1: 0
Median: 0
Q3: 4345.614
95-th percentile: 20726.04
Maximum: 335624.652
Range: 363755.04
Interquartile range (IQR): 4345.614

Descriptive statistics

Standard deviation: 12373.08711
Coefficient of variation (CV): 2.796986389
Kurtosis: 153.4802834
Mean: 4423.720888
Median Absolute Deviation (MAD): 0
Skewness: 8.738979445
Sum: 17920493.32
Variance: 153093284.7

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 2701 | 50.6%
3000 | 7 | 0.1%
7200 | 6 | 0.1%
3600 | 5 | 0.1%
6960 | 5 | 0.1%
10800 | 4 | 0.1%
2640 | 4 | 0.1%
8400 | 4 | 0.1%
2520 | 4 | 0.1%
7800 | 3 | 0.1%
Other values (1254) | 1308 | 24.5%
(Missing) | 1285 | 24.1%

Value | Count | Frequency (%)
-28130.388 | 1 | < 0.1%
-3530.28 | 1 | < 0.1%
-18.948 | 1 | < 0.1%
0 | 2701 | 50.6%
720 | 1 | < 0.1%

Value | Count | Frequency (%)
335624.652 | 1 | < 0.1%
139273.416 | 1 | < 0.1%
134381.7 | 1 | < 0.1%
134381.7 | 1 | < 0.1%
133655.664 | 2 | < 0.1%

dpd
Real number (ℝ≥0)

MISSING
SKEWED
ZEROS
Distinct count: 921
Unique (%): 22.7%
Missing: 1285
Missing (%): 24.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.084392084942053
Minimum: 0.0
Maximum: 295.6111111111111
Zeros: 1258
Zeros (%): 23.6%
Memory size: 41.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0.0625
Q3: 0.2989361702
95-th percentile: 3.411538462
Maximum: 295.6111111
Range: 295.6111111
Interquartile range (IQR): 0.2989361702

Descriptive statistics

Standard deviation: 8.972407119
Coefficient of variation (CV): 8.274135568
Kurtosis: 640.8029831
Mean: 1.084392085
Median Absolute Deviation (MAD): 0.0625
Skewness: 23.46348154
Sum: 4392.872336
Variance: 80.50408952

Histogram with fixed size bins (bins=10)

Value | Count | Frequency (%)
0 | 1258 | 23.6%
0.02 | 92 | 1.7%
0.08333333333 | 80 | 1.5%
0.07692307692 | 61 | 1.1%
0.07142857143 | 52 | 1.0%
0.09090909091 | 51 | 1.0%
0.0625 | 46 | 0.9%
0.05555555556 | 36 | 0.7%
0.1 | 36 | 0.7%
0.05882352941 | 36 | 0.7%
Other values (911) | 2303 | 43.2%
(Missing) | 1285 | 24.1%

Value | Count | Frequency (%)
0 | 1258 | 23.6%
0.02 | 92 | 1.7%
0.02040816327 | 14 | 0.3%
0.02083333333 | 7 | 0.1%
0.02127659574 | 11 | 0.2%

Value | Count | Frequency (%)
295.6111111 | 1 | < 0.1%
261.5365854 | 1 | < 0.1%
230.8064516 | 1 | < 0.1%
180.0540541 | 1 | < 0.1%
163.3902439 | 1 | < 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the slope does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
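
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the code the report itself runs; the function name `pearson_r` is hypothetical. The 1/(n-1) factors in the covariance and the standard deviations cancel, so they are omitted.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalised covariance and standard deviations (normalisation cancels out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (sd_x * sd_y)

x = [1, 2, 3, 4]
print(pearson_r(x, [2 * v + 3 for v in x]))    # a perfect increasing line
print(pearson_r(x, [-0.5 * v + 1 for v in x])) # a perfect decreasing line
```

Shifting or rescaling either variable by a positive factor leaves the result unchanged, which is the invariance property mentioned above.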

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
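
As a rough illustration of the rank-based calculation, the sketch below converts each variable to ranks and then applies the Pearson formula to those ranks. Giving tied values the average of their positions is an assumption made here, and the helper names are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# A nonlinear but strictly increasing relationship.
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```

Because only the ordering matters, any strictly increasing transformation of either variable leaves ρ unchanged.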

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
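
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, with the hypothetical name `kendall_tau_a`; statistical libraries often default to the tie-adjusted tau-b variant instead, so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# One swapped pair out of six: 5 concordant, 1 discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```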

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

Sample

First rows

(index) | id | date_str | years_to_matur | age_owner_years | original_matur_years | client_rate | original_volume | age_loan_years | outstanding_volume | planned_installments | prepaid_amount | NUMBER_OF_FAMILY_MEMBERS | FIXED_MONTHLY_EXPENSES | INCOME_houshold | dpd
0 | 1 | 2.018020e+07 | 9.821220 | 45.883171 | 25.01 | 0.0342 | 51850.00 | 15.191707 | 29288.497805 | 209.631707 | 0.000000 | 4.0 | 1040.000 | 0.000 | 0.146341
1 | 2 | 2.017068e+07 | 5.809167 | 57.493333 | 21.93 | 0.0342 | 155550.00 | 16.120000 | 75720.080833 | 634.615000 | 1907.967222 | 0.0 | 0.000 | 0.000 | 0.027778
2 | 3 | 2.016518e+07 | 9.249091 | 44.113182 | 25.02 | 0.0432 | 70516.00 | 15.767273 | 39353.626364 | 287.257727 | 1636.945455 | 3.0 | 1950.000 | 19200.000 | 0.000000
3 | 4 | 2.017218e+07 | 5.580769 | 41.031795 | 20.01 | 0.0850 | 31110.00 | 14.430000 | 14291.302821 | 165.646667 | 277.665897 | 2.0 | 260.000 | 2508.432 | 0.641026
4 | 5 | 2.017038e+07 | 4.773429 | 55.784857 | 19.27 | 0.0279 | 336431.43 | 14.493429 | 31272.502857 | 505.280857 | 630.223143 | NaN | NaN | NaN | NaN
5 | 6 | 2.017428e+07 | 11.925909 | 43.415000 | 25.64 | 0.0279 | 72590.00 | 13.717500 | 30239.141591 | 210.730000 | 579.494091 | 2.0 | 1757.418 | 8111.148 | 0.000000
6 | 7 | 2.017535e+07 | 12.185957 | 40.881702 | 25.51 | 0.0279 | 41480.00 | 13.323191 | 20964.954894 | 142.610000 | 373.241702 | NaN | NaN | NaN | NaN
7 | 8 | 2.016973e+07 | 11.075758 | 48.151515 | 24.58 | 0.0279 | 74041.80 | 13.499091 | 34384.615758 | 257.080000 | 903.744848 | NaN | NaN | NaN | NaN
8 | 9 | 2.017666e+07 | 4.555800 | 65.900600 | 18.48 | 0.0260 | 125341.39 | 13.929800 | 37794.061400 | 594.411800 | 510.826200 | NaN | NaN | NaN | NaN
9 | 10 | 2.018305e+07 | 7.864118 | 48.156471 | 20.06 | 0.0259 | 285175.00 | 12.198824 | 104801.444118 | 1109.020000 | 0.000000 | 0.0 | 0.000 | 0.000 | 0.264706

Last rows

(index) | id | date_str | years_to_matur | age_owner_years | original_matur_years | client_rate | original_volume | age_loan_years | outstanding_volume | planned_installments | prepaid_amount | NUMBER_OF_FAMILY_MEMBERS | FIXED_MONTHLY_EXPENSES | INCOME_houshold | dpd
5326 | 5346 | 2.019253e+07 | 19.194545 | 45.975455 | 19.70 | 0.0367 | 350292.071818 | 0.505455 | 337311.960000 | 0.000000 | 0.000000 | 5.0 | 2080.000 | 0.000 | 0.090909
5327 | 5347 | 2.019253e+07 | 25.397273 | 35.163636 | 25.90 | 0.0359 | 195370.800000 | 0.505455 | 195370.800000 | 0.000000 | 0.000000 | 4.0 | 1560.000 | 0.000 | 0.000000
5328 | 5348 | 2.019253e+07 | 24.754545 | 35.883636 | 25.25 | 0.0372 | 131510.454545 | 0.500000 | 124345.727273 | 0.000000 | 0.000000 | 0.0 | 0.000 | 0.000 | 0.454545
5329 | 5349 | 2.019253e+07 | 26.726364 | 30.455455 | 27.22 | 0.0371 | 248880.000000 | 0.500000 | 247251.742727 | 356.259091 | 0.000000 | 1.0 | 780.000 | 2400.000 | 0.000000
5330 | 5350 | 2.019274e+07 | 13.901000 | 39.626000 | 14.44 | 0.0356 | 223676.302000 | 0.542000 | 213838.411000 | 314.523000 | 0.000000 | 0.0 | 0.000 | 0.000 | 0.100000
5331 | 5351 | 2.019253e+07 | 19.859091 | 38.545455 | 20.36 | 0.0362 | 227106.770909 | 0.500000 | 216174.031818 | 482.816364 | 424.227273 | 4.0 | 780.000 | 0.000 | 0.090909
5332 | 5352 | 2.019274e+07 | 19.594000 | 30.414000 | 20.13 | 0.0356 | 207400.000000 | 0.542000 | 204755.477000 | 589.554000 | 69.431000 | 2.0 | 1755.000 | 6338.736 | 0.000000
5333 | 5353 | 2.019253e+07 | 9.938182 | 35.255455 | 10.44 | 0.0373 | 79189.090909 | 0.500000 | 71186.596364 | 386.250000 | 433.654545 | 1.0 | 1300.000 | 0.000 | 0.090909
5334 | 5354 | 2.019253e+07 | 29.644545 | 29.554545 | 30.14 | 0.0356 | 362950.000000 | 0.500000 | 360658.036364 | 517.690000 | 0.000000 | 2.0 | 1574.417 | 5382.612 | 0.000000
5335 | 5355 | 2.019253e+07 | 9.560909 | 37.079091 | 10.06 | 0.0369 | 186660.000000 | 0.495455 | 174494.760000 | 1249.769091 | 787.611818 | 1.0 | 1950.000 | 0.000 | 0.000000